Goto

Collaborating Authors

 12-lead ecg


Comparative Analysis of Data Augmentation for Clinical ECG Classification with STAR

Nemati, Nader

arXiv.org Artificial Intelligence

Clinical 12-lead ECG classification remains difficult because of diverse recording conditions, overlapping pathologies, and pronounced label imbalance hinder generalization, while unconstrained augmentations risk distorting diagnostically critical morphology. In this study, Sinusoidal Time--Amplitude Resampling (STAR) is introduced as a beat-wise augmentation that operates strictly between successive R-peaks to apply controlled time warping and amplitude scaling to each R--R segment, preserving the canonical P--QRS--T order and leaving the head and tail of the trace unchanged. STAR is designed for practical pipelines and offers: (i) morphology-faithful variability that broadens training diversity without corrupting peaks or intervals; (ii) source-resilient training, improving stability across devices, sites, and cohorts without dataset-specific tuning; (iii) model-agnostic integration with common 1D SE--ResNet-style ECG encoders backbone; and (iv) better learning on rare classes via beat-level augmentation, reducing overfitting by resampling informative beats instead of duplicating whole records. In contrast to global crops, large shifts, or additive noise, STAR avoids transformations that suppress or misalign clinical landmarks. A complete Python implementation and a transparent training workflow are released, aligned with a source-aware, stratified five-fold protocol over a multi-institutional 12-lead corpus, thereby facilitating inspection and reuse. Taken together, STAR provides a simple and controllable augmentation for clinical ECG classification where trustworthy morphology, operational simplicity, and cross-source durability are essential.


Reconstructing 12-Lead ECG from 3-Lead ECG using Variational Autoencoder to Improve Cardiac Disease Detection of Wearable ECG Devices

Guan, Xinyan, Lai, Yongfan, Jin, Jiarui, Li, Jun, Wang, Haoyu, Zhao, Qinghao, Zhang, Deyun, Geng, Shijia, Hong, Shenda

arXiv.org Artificial Intelligence

Twelve-lead electrocardiograms (ECGs) are the clinical gold standard for cardiac diagnosis, providing comprehensive spatial coverage of the heart necessary to detect conditions such as myocardial infarction (MI). However, their lack of portability limits continuous and large-scale use. Three-lead ECG systems are widely used in wearable devices due to their simplicity and mobility, but they often fail to capture pathologies in unmeasured regions. To address this, we propose WearECG, a Variational Autoencoder (VAE) method that reconstructs twelve-lead ECGs from three leads: II, V1, and V5. Our model includes architectural improvements to better capture temporal and spatial dependencies in ECG signals. We evaluate generation quality using MSE, MAE, and Frechet Inception Distance (FID), and assess clinical validity via a Turing test with expert cardiologists. To further validate diagnostic utility, we fine-tune ECGFounder, a large-scale pretrained ECG model, on a multi-label classification task involving over 40 cardiac conditions, including six different myocardial infarction locations, using both real and generated signals. Experiments on the MIMIC dataset show that our method produces physiologically realistic and diagnostically informative signals, with robust performance in downstream tasks. This work demonstrates the potential of generative modeling for ECG reconstruction and its implications for scalable, low-cost cardiac screening.


Translation from Wearable PPG to 12-Lead ECG

Ji, Hui, Gao, Wei, Zhou, Pengfei

arXiv.org Artificial Intelligence

The 12-lead electrocardiogram (ECG) is the gold standard for cardiovascular monitoring, offering superior diagnostic granularity and specificity compared to photoplethysmography (PPG). However, existing 12-lead ECG systems rely on cumbersome multi-electrode setups, limiting sustained monitoring in ambulatory settings, while current PPG-based methods fail to reconstruct multi-lead ECG due to the absence of inter-lead constraints and insufficient modeling of spatial-temporal dependencies across leads. To bridge this gap, we introduce P2Es, an innovative demographic-aware diffusion framework designed to generate clinically valid 12-lead ECG from PPG signals via three key innovations. Specifically, in the forward process, we introduce frequency-domain blurring followed by temporal noise interference to simulate real-world signal distortions. In the reverse process, we design a temporal multi-scale generation module followed by frequency deblurring. In particular, we leverage KNN-based clustering combined with contrastive learning to assign affinity matrices for the reverse process, enabling demographic-specific ECG translation. Extensive experimental results show that P2Es outperforms baseline models in 12-lead ECG reconstruction.


Sensing Cardiac Health Across Scenarios and Devices: A Multi-Modal Foundation Model Pretrained on Heterogeneous Data from 1.7 Million Individuals

Gu, Xiao, Tang, Wei, Han, Jinpei, Sangha, Veer, Liu, Fenglin, Gowda, Shreyank N, Ribeiro, Antonio H., Schwab, Patrick, Branson, Kim, Clifton, Lei, Ribeiro, Antonio Luiz P., Liu, Zhangdaihong, Clifton, David A.

arXiv.org Artificial Intelligence

Cardiac biosignals, such as electrocardiograms (ECG) and photoplethysmograms (PPG), are of paramount importance for the diagnosis, prevention, and management of cardiovascular diseases, and have been extensively used in a variety of clinical tasks. Conventional deep learning approaches for analyzing these signals typically rely on homogeneous datasets and static bespoke models, limiting their robustness and generalizability across diverse clinical settings and acquisition protocols. In this study, we present a cardiac sensing foundation model (CSFM) that leverages advanced transformer architectures and a generative, masked pretraining strategy to learn unified representations from vast, heterogeneous health records. Our model is pretrained on an innovative multi-modal integration of data from multiple large-scale datasets (including MIMIC-III-WDB, MIMIC-IV-ECG, and CODE), comprising cardiac signals and the corresponding clinical or machine-generated text reports from approximately 1.7 million individuals. We demonstrate that the embeddings derived from our CSFM not only serve as effective feature extractors across diverse cardiac sensing scenarios, but also enable seamless transfer learning across varying input configurations and sensor modalities. Extensive evaluations across diagnostic tasks, demographic information recognition, vital sign measurement, clinical outcome prediction, and ECG question answering reveal that CSFM consistently outperforms traditional one-modal-one-task approaches. Notably, CSFM exhibits robust performance across multiple ECG lead configurations from standard 12-lead systems to single-lead setups, and in scenarios where only ECG, only PPG, or a combination thereof is available. These findings highlight the potential of CSFM as a versatile and scalable solution, for comprehensive cardiac monitoring.


High-Throughput Detection of Risk Factors to Sudden Cardiac Arrest in Youth Athletes: A Smartwatch-Based Screening Platform

Xiang, Evan, Wang, Thomas, Poddar, Vivan

arXiv.org Artificial Intelligence

The National Institute of Health defines Sudden Cardiac Arrest (SCA) as a moment when the heart is not beating sufficiently to maintain perfusion due to the heart's electrical or mechanical failure [1]. SCA is the leading cause of death among youth athletes -- a focus group that has a heightened risk of SCA -- with 1 in 16,000 young athletes and 1 in 5200 athletes at the elite level afflicted yearly [1, 2]. For youth athletes, the primary cause of SCA is hypertrophic cardiomyopathy (HCM) in the U.S. and arrhythmogenic right ventricular cardiomyopathy (ARVC) in Europe. SCA may also result from coronary artery disease, Long QT Syndrome, Myocarditis, Wolff-Parkinson-White syndrome, and dilated cardiomyopathy [1-4]. Figure 1 provides a comprehensive list of significant predictors of SCA [5]. While these disorders do not always lead to instances of SCA, they present a substantial increase in the chance of SCA events, which is further amplified by the innate risk of sports participation [6-9]. Concerningly, the current 14-point questionnaire pre-participation evaluation (PPE) recommended by the American Heart Association (AHA) is ineffective at detecting risk factors with poor sensitivity and specificity of 18.8% and 68.0%


Multi-Channel Masked Autoencoder and Comprehensive Evaluations for Reconstructing 12-Lead ECG from Arbitrary Single-Lead ECG

Chen, Jiarong, Wu, Wanqing, Liu, Tong, Hong, Shenda

arXiv.org Artificial Intelligence

In the context of cardiovascular diseases (CVD) that exhibit an elevated prevalence and mortality, the electrocardiogram (ECG) is a popular and standard diagnostic tool for doctors, commonly utilizing a 12-lead configuration in clinical practice. However, the 10 electrodes placed on the surface would cause a lot of inconvenience and discomfort, while the rapidly advancing wearable devices adopt the reduced-lead or single-lead ECG to reduce discomfort as a solution in long-term monitoring. Since the single-lead ECG is a subset of 12-lead ECG, it provides insufficient cardiac health information and plays a substandard role in real-world healthcare applications. Hence, it is necessary to utilize signal generation technologies to reduce their clinical importance gap by reconstructing 12-lead ECG from the real single-lead ECG. Specifically, this study proposes a multi-channel masked autoencoder (MCMA) for this goal. In the experimental results, the visualized results between the generated and real signals can demonstrate the effectiveness of the proposed framework. At the same time, this study introduces a comprehensive evaluation benchmark named ECGGenEval, encompassing the signal-level, feature-level, and diagnostic-level evaluations, providing a holistic assessment of 12-lead ECG signals and generative model. Further, the quantitative experimental results are as follows, the mean square errors of 0.0178 and 0.0658, correlation coefficients of 0.7698 and 0.7237 in the signal-level evaluation, the average F1-score with two generated 12-lead ECG is 0.8319 and 0.7824 in the diagnostic-level evaluation, achieving the state-of-the-art performance. The open-source code is publicly available at \url{https://github.com/CHENJIAR3/MCMA}.


Towards Synthesizing Twelve-Lead Electrocardiograms from Two Asynchronous Leads

Jo, Yong-Yeon, Choi, Young Sang, Jang, Jong-Hwan, Kwon, Joon-Myoung

arXiv.org Artificial Intelligence

The electrocardiogram (ECG) records electrical signals in a non-invasive way to observe the condition of the heart, typically looking at the heart from 12 different directions. Several types of the cardiac disease are diagnosed by using 12-lead ECGs Recently, various wearable devices have enabled immediate access to the ECG without the use of wieldy equipment. However, they only provide ECGs with a couple of leads. This results in an inaccurate diagnosis of cardiac disease due to lacking of required leads. We propose a deep generative model for ECG synthesis from two asynchronous leads to ten leads. It first represents a heart condition referring to two leads, and then generates ten leads based on the represented heart condition. Both the rhythm and amplitude of leads generated resemble those of the original ones, while the technique removes noise and the baseline wander appearing in the original leads. As a data augmentation method, our model improves the classification performance of models compared with models using ECGs with only one or two leads.


Riemannian Prediction of Anatomical Diagnoses in Congenital Heart Disease based on 12-lead ECGs

Alkan, Muhammet, Veldtman, Gruschen, Deligianni, Fani

arXiv.org Artificial Intelligence

Congenital heart disease (CHD) is a relatively rare disease that affects patients at birth and results in extremely heterogeneous anatomical and functional defects. 12-lead ECG signal is routinely collected in CHD patients because it provides significant biomarkers for disease prognosis. However, developing accurate machine learning models is challenging due to the lack of large available datasets. Here, we suggest exploiting the Riemannian geometry of the spatial covariance structure of the ECG signal to improve classification. Firstly, we use covariance augmentation to mix samples across the Riemannian geodesic between corresponding classes. Secondly, we suggest to project the covariance matrices to their respective class Riemannian mean to enhance the quality of feature extraction via tangent space projection. We perform several ablation experiments and demonstrate significant improvement compared to traditional machine learning models and deep learning on ECG time series data.


Branched Latent Neural Maps

Salvador, Matteo, Marsden, Alison Lesley

arXiv.org Artificial Intelligence

We introduce Branched Latent Neural Maps (BLNMs) to learn finite dimensional input-output maps encoding complex physical processes. A BLNM is defined by a simple and compact feedforward partially-connected neural network that structurally disentangles inputs with different intrinsic roles, such as the time variable from model parameters of a differential equation, while transferring them into a generic field of interest. BLNMs leverage latent outputs to enhance the learned dynamics and break the curse of dimensionality by showing excellent generalization properties with small training datasets and short training times on a single processor. Indeed, their generalization error remains comparable regardless of the adopted discretization during the testing phase. Moreover, the partial connections significantly reduce the number of tunable parameters. We show the capabilities of BLNMs in a challenging test case involving electrophysiology simulations in a biventricular cardiac model of a pediatric patient with hypoplastic left heart syndrome. The model includes a 1D Purkinje network for fast conduction and a 3D heart-torso geometry. Specifically, we trained BLNMs on 150 in silico generated 12-lead electrocardiograms (ECGs) while spanning 7 model parameters, covering cell-scale and organ-level. Although the 12-lead ECGs manifest very fast dynamics with sharp gradients, after automatic hyperparameter tuning the optimal BLNM, trained in less than 3 hours on a single CPU, retains just 7 hidden layers and 19 neurons per layer. The resulting mean square error is on the order of $10^{-4}$ on a test dataset comprised of 50 electrophysiology simulations. In the online phase, the BLNM allows for 5000x faster real-time simulations of cardiac electrophysiology on a single core standard computer and can be used to solve inverse problems via global optimization in a few seconds of computational time.


Unlocking the Diagnostic Potential of ECG through Knowledge Transfer from Cardiac MRI

Turgut, Özgün, Müller, Philip, Hager, Paul, Shit, Suprosanna, Starck, Sophie, Menten, Martin J., Martens, Eimo, Rueckert, Daniel

arXiv.org Artificial Intelligence

The electrocardiogram (ECG) is a widely available diagnostic tool that allows for a cost-effective and fast assessment of the cardiovascular health. However, more detailed examination with expensive cardiac magnetic resonance (CMR) imaging is often preferred for the diagnosis of cardiovascular diseases. While providing detailed visualization of the cardiac anatomy, CMR imaging is not widely available due to long scan times and high costs. To address this issue, we propose the first self-supervised contrastive approach that transfers domain-specific information from CMR images to ECG embeddings. Our approach combines multimodal contrastive learning with masked data modeling to enable holistic cardiac screening solely from ECG data. In extensive experiments using data from 40,044 UK Biobank subjects, we demonstrate the utility and generalizability of our method. We predict the subject-specific risk of various cardiovascular diseases and determine distinct cardiac phenotypes solely from ECG data. In a qualitative analysis, we demonstrate that our learned ECG embeddings incorporate information from CMR image regions of interest. We make our entire pipeline publicly available, including the source code and pre-trained model weights.